Simultaneous 2D and 3D perception for stereoscopic displays based on polarized or active shutter glasses

Authors

  • Payman Aflaki
  • Miska M. Hannuksela
  • Hamed Sarbolandi
  • Moncef Gabbouj
Abstract

Stereoscopic 3D content is typically viewed using either polarizing or active shutter glasses. In certain cases, some viewers may not wear viewing glasses, and it would therefore be desirable to tune the stereoscopic 3D content so that it can be watched simultaneously with and without viewing glasses. In this paper we propose a video post-processing technique which enables good-quality 3D and 2D perception of the same content. This is done by manipulating one view to make it more similar to the other view, which reduces the ghosting artifact perceived without viewing glasses while 3D perception is maintained. The proposed technique includes three steps: disparity selection, contrast adjustment, and low-pass filtering. The proposed approach was evaluated through an extensive series of subjective tests, which also revealed good adjustment parameters to suit viewing with and without viewing glasses with an acceptable 3D and 2D quality, respectively.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

In recent years, the number of 3D movie titles has increased considerably, both at cinemas and as Blu-ray 3D discs. Moreover, stereoscopic video content is broadcast commercially on a few television channels. Hence, many user-side devices are already capable of processing stereoscopic 3D content, whose volume is expected to rise sharply in the coming years. Customer preferences drive the direction of improvements and novelties in the presentation methods of 3D content, and it is therefore important to understand the habits of viewing 3D content and the mechanisms of human vision. Psycho-visual aspects must therefore be considered when displaying 3D content.

The human visual system (HVS) perceives color images using receptors on the retina of the eye which respond to three broad color bands in the red, green and blue regions of the color spectrum. The HVS is more sensitive to overall luminance changes than to color changes. The major challenge in understanding and modeling visual perception is that what people see is not simply a translation of retinal stimuli (i.e. the image on the retina). Moreover, the HVS has limited sensitivity: it does not react to small stimuli, it cannot discriminate between signals with infinite precision, and it also exhibits saturation effects. In general, one could say that it performs a compression process in order to keep visual stimuli in a range the brain can interpret.

Stereoscopic vision is one of the principal methods by which humans extract 3D information from a scene. The HVS is able to fuse the sensory information from the two eyes in such a way that a 3D perception of the scene is formed, in a process called stereopsis. In stereoscopic presentation, the brain registers slight perspective differences between the left and right views to create a 3D representation incorporating both views. In other words, the visual cortex receives information from each eye and combines this information to form a single stereoscopic image.

Presenting different views to each eye (stereoscopic presentation) usually results in binocular rivalry, where the two monocular patterns are perceived alternately [1]. In such a case, where dissimilar monocular stimuli are presented to corresponding retinal locations of the two eyes, the two stimuli compete for perceptual dominance rather than being perceived as a single stable stimulus. Rivalry can be triggered by very simple stimulus differences or by differences between complex images. These include differences in color, luminance, contrast polarity, form, size, and velocity. Stronger, high-contrast stimuli lead to stronger perceptual competition. In particular cases, one of the two stimuli dominates the field. This effect is known as binocular suppression [2,3].

According to the binocular suppression theory, it is assumed that the HVS fuses two images with different levels of sharpness such that the perceived quality is close to that of the sharper view [4]. In contrast, if the two views show different amounts of blocking artifacts, no considerable binocular suppression is observed and the binocular quality of a stereoscopic sequence is rated close to the mean quality of both views [5]. Binocular suppression has been exploited in asymmetric stereoscopic video coding, for example by providing one of the views with lower spatial resolution [6], lower frequency bandwidth [7], fewer color quantization steps [8], or coarser transform-domain quantization [9,10].

In this paper we exploit binocular suppression and asymmetric quality between views in another domain, namely the presentation of stereoscopic 3D content simultaneously on a single display for viewers with and without viewing glasses. Such a viewing situation may occur, for example, when television viewing is not active and the television set is just being kept on as a habit. The television may be located in a central place at home, where many family members spend their free time. Consequently, some viewers might be actively watching the television with glasses while others are primarily doing something else (without glasses) and just momentarily glancing at the television. Furthermore, the price of the glasses, particularly the active ones, might constrain the number of glasses households are willing to buy. Hence, on some occasions, households might not have a sufficient number of glasses for the family members and visitors watching the television.

While glasses-based stereoscopic display systems provide a good stereoscopic viewing quality, the perceived quality of the stereo picture or picture sequence viewed without glasses is intolerable. Recently, the authors of [11] presented a system for automatic 2D/3D display mode selection based on whether the users in front of the 3D display wear viewing glasses. In that work, a combination of special viewing glasses and a camera on top of the display enables the display mode selection. However, this approach does not solve the problem of a mixed group of observers, some with and some without viewing glasses; it only enables switching between 2D and 3D presentation based on the number of subjects with or without viewing glasses in front of the display.

We enable the same content to be viewed simultaneously both in 3D with viewing glasses and in 2D without viewing glasses by digital signal processing of the decoded stereoscopic video content, making the perceived quality in glasses-based stereoscopic viewing systems acceptable for viewers with and without 3D viewing glasses at the same time. Viewers with glasses should be able to perceive stereoscopic pictures with acceptable quality and good depth perception, while viewers without glasses should be able to perceive single-view pictures, i.e. one of the views of the stereoscopic video. The proposed processing is intended to take place at the display and can be adapted, for example, based on the ratio of users with and without viewing glasses.

In the proposed algorithm, one of the views is processed so that its presence becomes harder to perceive when viewing the content without viewing glasses, while the quality and 3D perception are not compromised much, thanks to binocular suppression. The proposed method includes three steps, namely disparity adaptation, low-pass filtering of the non-dominant view, and contrast adjustment.
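
The two view-softening steps named above (contrast adjustment and low-pass filtering of the non-dominant view) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the box kernel, the `gain` and `radius` values, the order of the two steps, and the helper names are hypothetical and are not the filters or parameters evaluated in the paper. The disparity step is not covered.

```python
def box_blur(image, radius=1):
    """Low-pass filter a 2D grayscale image (list of rows) with a box kernel.

    Border pixels are handled by clamping coordinates to the image edge.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def reduce_contrast(image, gain):
    """Scale each pixel's deviation from the image mean by `gain`."""
    pixels = [p for row in image for p in row]
    m = sum(pixels) / len(pixels)
    return [[m + gain * (p - m) for p in row] for row in image]


def soften_non_dominant_view(view, gain=0.5, radius=1):
    """Hypothetical helper: contrast reduction followed by low-pass filtering.

    The step order and default values are assumptions of this sketch.
    """
    return box_blur(reduce_contrast(view, gain), radius)


# Toy "non-dominant" view with a sharp vertical edge.
right_view = [[0, 0, 255, 255],
              [0, 0, 255, 255],
              [0, 0, 255, 255]]
softened = soften_non_dominant_view(right_view, gain=0.5, radius=1)
```

After processing, the edge in the toy view spans a smaller intensity range and is spread over more pixels, which is the kind of change that would make the second view less visible as a ghost when the two views are seen superimposed without glasses.
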
While known methods are used for each processing step, we are not aware of previous research tackling the same problem, i.e. stereoscopic 3D content being viewed simultaneously with viewing glasses by some users and without viewing glasses by other users.

The rest of this paper is organized as follows. In Section 2 we present a literature review of the research fields related to the algorithm proposed in the paper, while the proposed post-processing algorithm is described in Section 3. The test setup and results are presented in Sections 4 and 5, respectively. Finally, the paper concludes in Section 6.

Please cite this article in press as: P. Aflaki et al., Simultaneous 2D and 3D perception for stereoscopic displays based on polarized or active shutter glasses, J. Vis. Commun. (2013), http://dx.doi.org/10.1016/j.jvcir.2013.03.014

2. Literature review

In this section, we provide an extensive literature review focused on the operation of the human visual system when observing asymmetric-quality stereoscopic video. Different types of asymmetry are classified and subjective assessment results are reported in Sections 2.1 and 2.2 from the perception and video compression viewpoints, respectively. Moreover, in Section 2.3, we discuss the effect of camera separation on depth perception. These techniques provide a basis for the rendering algorithms utilized in this study. In Section 2.4 we summarize some key aspects affecting the perceived 3D video quality, which are subsequently taken into consideration in the performed subjective viewing experiment. Finally, in Section 2.5, the concept of depth-enhanced multiview video coding is described, as it can provide an unlimited number of rendered views at the 3D display. This coding approach can be exploited to display stereoscopic video with arbitrary camera separations, hence facilitating the disparity adaptation step of the method proposed in this paper.

2.1. Visual perception of asymmetric stereoscopic video

Binocular suppression provides an opportunity to use different types of asymmetry between views. Many studies have investigated which types of asymmetry are subjectively most pleasing to human observers or closest to symmetric stereoscopic video, and have sought optimal settings for various parameters related to the strength of the asymmetry. Typically, the greater the amount of high-frequency components (more detail), the better the 3D perception of the objects. This means that stereo acuity decreases as the amount of blurring increases [12]. However, [13] studied this topic in more detail, showing that within certain limits it is possible to perceive stimuli well in 3D even when one eye sees a blurred image while the other eye sees a sharper one.

The capability of the HVS to fuse stereo pairs of different sharpness has been studied in many papers. The authors of [6] subjectively assessed the quality of uncompressed mixed-resolution asymmetric stereoscopic video by downsampling one view with ratios of 1/2, 3/8, and 1/4. The results show that when the downsampling ratio is 1/2, the average subjective score indicates sufficient subjective quality, comparable to that of the full-resolution stereo pair. A similar experiment was conducted by Stelmach in [14], where the response of the HVS to mixed-resolution stereo video sequences, in which one view was low-pass filtered, was explored through a series of subjective tests. Subjects rated the overall quality, sharpness, and depth perception of stereo video clips. The results show that the overall sensation of depth was unaffected by low-pass filtering, while ratings of quality and sharpness were strongly weighted towards the eye with the greater spatial resolution.
Moreover, the authors of [7] evaluated the perceptual impact of low-pass filtering applied to one view of stereo image pairs and stereoscopic video sequences in order to achieve an asymmetric stereo scenario. The results showed that binocular perception was dominated by the high-quality view when the other view was low-pass filtered.

2.2. Asymmetric stereoscopic video coding

The types of asymmetric video coding can be coarsely classified into mixed-resolution coding, asymmetric sample-domain quantization, asymmetric transform-domain quantization, and asymmetric temporal resolution. Furthermore, a combination of different types of scalability can be used. The different types of asymmetric stereoscopic video coding are reviewed briefly below.

Mixed-resolution stereoscopic video coding [15], also referred to as resolution-asymmetric stereoscopic video coding, introduces asymmetry between views by low-pass filtering one view, hence providing a smaller amount of spatial detail or a lower spatial resolution. Furthermore, a coarser sampling grid is usually utilized for the low-pass-filtered image, i.e. the content is represented with fewer pixels. Mixed-resolution coding can also be applied to a subset of color components. For example, in [16], the luma pictures of both views had equal resolution while the chroma pictures of one view were represented by fewer samples than the respective chroma pictures of the other view.

Fig. 1. Disparity calculation in arcmin based on different disparities in number of pixels on display.

In asymmetric transform-domain quantization, the transform coefficients of the two views are quantized with different step sizes. As a result, one of the views has a lower fidelity and may be subject to a greater amount of visible coding artifacts, such as blocking and ringing.
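
As an illustration of asymmetric transform-domain quantization, the following sketch transforms one block with an orthonormal DCT, quantizes the coefficients with a uniform step, and reconstructs the block, once with a fine step and once with a coarse step. The 8x8 block size, the uniform quantizer, the step values, and all function names are assumptions of this sketch; they do not describe the codecs or settings used in the cited studies.

```python
import math

N = 8  # block size assumed for this sketch


def dct_matrix(n=N):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def quantize_block(block, step):
    """Forward DCT, uniform quantization with `step`, then inverse DCT."""
    c = dct_matrix(len(block))
    ct = transpose(c)
    coeffs = matmul(matmul(c, block), ct)
    coeffs = [[round(v / step) * step for v in row] for row in coeffs]
    return matmul(matmul(ct, coeffs), c)


def mse(a, b):
    """Mean squared error between two equally sized blocks."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2
               for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


# Synthetic 8x8 luma block; a deterministic pattern stands in for real video.
block = [[(37 * x + 91 * y * y) % 256 for x in range(N)] for y in range(N)]
fine = quantize_block(block, step=2.0)     # stands in for the higher-fidelity view
coarse = quantize_block(block, step=48.0)  # stands in for the lower-fidelity view
```

Comparing `mse(block, fine)` with `mse(block, coarse)` shows the fidelity gap that the coarser step introduces in one view. Because each block is quantized independently, a coarse step also tends to produce discontinuities at block boundaries, which is the blocking artifact mentioned above.
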
In [9], the authors performed a series of subjective tests on coded stereoscopic video clips with asymmetric luminance qualities. Asymmetric luminance was achieved with coarser quantization of the transform coefficient values in the luma of one view. Subjective results show that stereoscopic video coding with asymmetric luminance information achieved a bitrate reduction of 9% to 34% while maintaining the just noticeable distortion as introduced in [17]. Moreover, the authors of [10] subjectively compared the quality of coded mixed-resolution stereoscopic video with that of compressed full-resolution video. The results revealed that, under the same bitrate constraint, the same subjective quality can be expected while decreasing the spatial resolution of one view by a factor of 1/2 horizontally and vertically.

In asymmetric sample-domain quantization [8], the sample values of each view are quantized with a different step size. A higher compression ratio can be achieved for the quantized view compared to the other view, due to fewer quantization steps. Both luma and chroma samples can be processed with different quantization step sizes. If the number of quantization steps in each view matches a power of two, a special case of asymmetric sample-domain quantization, called bit-depth-asymmetric stereoscopic video, is obtained. A video coding scheme based on uneven quantization steps for the luma sample values of the left and right views, combined with spatial downsampling, is presented in [8]. The results of the subjective quality assessment showed that the average ratings of the proposed method outperformed full-resolution symmetric and mixed-resolution asymmetric stereoscopic video coding schemes with different downsampling ratios.

To our knowledge, asymmetric contrast has not been utilized in stereoscopic video compression.
However, the authors of [18] subjectively assessed a wide range of binocular image imperfections and identified asymmetry threshold values which provide equal visual comfort. It was found that the contrast difference between views should not exceed 25% to prevent eye strain in subjects.

2.3. Impact of parallax on depth perception

Screen parallax is created by the difference between the left-eye and right-eye images on the 3D display. We need to converge and accommodate (focus) the eyes in order to project the object of interest onto the fovea in both eyes. The distance between us and the object of interest defines the amount of convergence and accommodation in our eyes. Convergence can be defined as a process that is basically disparity driven and consists of the movement of the two eyes in opposite directions to correctly locate the area of interest on the fovea. Accommodation tries to remove blur and hence alters the lens to focus the area of interest on the fovea [19]. Under natural conditions the accommodation and convergence systems are reflexively linked. The amount of accommodation needed to focus on an object changes in proportion to the amount of convergence required to project the same object onto the fovea of the eyes. Under conditions of binocular fusion, for a certain amount of convergence, accommodation has a certain depth of focus within which it can move freely and objects are perceived properly [20].

An area defining an absolute limit for the disparities that can be fused by the HVS is known as Panum's fusional area [21,22]. It describes an area within which different points projected onto the left and right retinas produce binocular fusion and a sensation of depth. Hence, horizontal disparity should be limited to within Panum's fusional area.
Otherwise, excessive disparity could cause double vision or severe visual fatigue. The limits of Panum's fusional area are affected by many factors, including stimulus size, spatial frequency, exposure duration, temporal effects, continuous features, and amount of luminance [21]. Disparities beyond 60–70 arcmin are assumed to cause visual discomfort and eye strain [23,24].

Camera separation creates a disparity between the positions of the same object in the left- and right-view images on a display, which can be expressed as a number of pixels. Based on the display width and resolution, the disparity can be converted from a number of pixels to a distance disparity, e.g. in centimeters, as shown in (1) and (2).

\[ w = \frac{W_{\mathrm{cm}}}{W_{\mathrm{pixels}}} \tag{1} \]

where W_cm is the display width in cm and W_pixels is the display width in pixels. Hence, w represents the width of one pixel in cm.
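
The conversion in (1) can be sketched in a few lines. Only the pixel-width relation comes from the text above. The step from centimeters to arcmin uses the standard visual-angle geometry with an assumed viewing distance; it is included as an illustration and is not claimed to match the paper's own equation (2), which is not reproduced in this excerpt. The function names, the example display size, and the 60 arcmin check (the lower end of the range quoted above) are assumptions as well.

```python
import math


def pixel_width_cm(display_width_cm, display_width_pixels):
    """Equation (1): width of one pixel in cm."""
    return display_width_cm / display_width_pixels


def disparity_cm(disparity_pixels, display_width_cm, display_width_pixels):
    """Disparity on the screen surface, converted from pixels to cm."""
    return disparity_pixels * pixel_width_cm(display_width_cm,
                                             display_width_pixels)


def disparity_arcmin(disparity_pixels, display_width_cm,
                     display_width_pixels, viewing_distance_cm):
    """Angle subtended at the eye by the on-screen disparity, in arcmin.

    Assumption of this sketch: the disparity is centered in front of the
    viewer at `viewing_distance_cm`, so the angle is 2 * atan(d / (2 * D)).
    """
    d = disparity_cm(disparity_pixels, display_width_cm, display_width_pixels)
    angle_rad = 2.0 * math.atan(d / (2.0 * viewing_distance_cm))
    return math.degrees(angle_rad) * 60.0


def within_comfort_limit(arcmin, limit_arcmin=60.0):
    """Check a disparity against an assumed comfort threshold in arcmin."""
    return abs(arcmin) <= limit_arcmin


# Hypothetical setup: a 102 cm wide, 1920-pixel display viewed from 300 cm.
angle = disparity_arcmin(40, 102.0, 1920, 300.0)
comfortable = within_comfort_limit(angle)
```

For a fixed pixel disparity, the angular disparity grows with the physical pixel width and shrinks with the viewing distance, so the same content can fall inside or outside the comfort range depending on the display and the seating position.
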

Similar resources

The Effect of Applying 2D Enhancement Algorithms on 3D Video Content

Enhancement algorithms are typically applied to video content to increase their appeal to viewers. Such algorithms are readily available in the literature and are already widely applied in, for example, commercially available TVs. On the contrary, not much research has been done on enhancing stereoscopic 3D video content. In this paper, we present research focused on the effect of applying enha...

EEG based evaluation of stereoscopic 3D displays for viewer discomfort

BACKGROUND Consumer preference is rapidly changing from 2D to 3D movies due to the sensational effects of 3D scenes, like those in Avatar and The Hobbit. Two 3D viewing technologies are available: active shutter glasses and passive polarized glasses. However, there are consistent reports of discomfort while viewing in 3D mode where the discomfort may refer to dizziness, headaches, nausea or sim...

Investigating the cross-compatibility of IR-controlled active shutter glasses

Active Shutter Glasses (also known as Liquid Crystal Shutter (LCS) 3D glasses or just Shutter Glasses) are a commonly used selection device used to view stereoscopic 3D content on time-sequential stereoscopic displays. Regrettably most of the IR (infrared) controlled active shutter glasses released to date by various manufacturers have used a variety of different IR communication protocols whic...

Article 4 Comparison of S3D Display Technology on Image Quality and Viewing Experiences: Active-Shutter 3D TV vs. Passive-Polarized 3D TV

Background: Stereoscopic 3D TV systems convey depth perception to the viewer by delivering to each eye separately filtered images that represent two slightly different perspectives. Currently two primary technologies are used in S3D televisions: Active shutter systems, which use alternate frame sequencing to deliver a full-frame image to one eye at a time at a fast refresh rate, and Passive pol...

Optical Characterization of Shutter Glasses Stereoscopic 3D displays

A method to characterize time sequential stereoscopic 3D displays which is based on the measurement of the temporal behavior of the system versus grey levels is presented. OPTISCOPE SA, especially designed for precise measurement of luminance and temporal behavior of LCD displays is used. The transmittance and response time of the shutter glasses is first evaluated. Then the grey to grey respon...

Efficient Stereoscopic Rendering in Virtual Endoscopy Applications

Optical endoscopy suffers from several problems which increase the difficulties of a successful intervention. Among these problems are the limited spatial or depth perception, and the fish-eye effect which virtually flattens the geometry of the anatomical structures. Standard virtual endoscopy is inflicted by similar problems. In this paper, we present stereoscopic VIVENDI, an approach for the ...

Journal title:
  • J. Visual Communication and Image Representation

Volume 25  Issue 

Pages  -

Publication date 2014